Overview

Dataset statistics

Number of variables: 16
Number of observations: 100116
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.2 MiB
Average record size in memory: 128.0 B
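
The two memory figures above are consistent with a simple size estimate. The sketch below assumes every one of the 16 columns is stored as an 8-byte value (an assumption; the report does not state the dtypes) and ignores the small overhead of the index.

```python
n_rows = 100116        # "Number of observations" in the report
n_columns = 16         # "Number of variables" in the report
bytes_per_value = 8    # assumption: each column holds 8-byte values

total_bytes = n_rows * n_columns * bytes_per_value
total_mib = total_bytes / 2**20          # 1 MiB = 2**20 bytes
record_bytes = total_bytes / n_rows      # bytes needed for one row

print(f"{total_mib:.1f} MiB in total, {record_bytes:.1f} B per record")
```

Under that assumption the estimate rounds to the reported 12.2 MiB and 128.0 B.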

Variable types

Numeric: 9
Categorical: 7

Warnings

id is highly correlated with customer_id (High correlation)
customer_id is highly correlated with id (High correlation)
is_cardiologist is highly correlated with is_gp (High correlation)
is_gp is highly correlated with is_cardiologist (High correlation)
gender_female is highly correlated with gender_male (High correlation)
gender_male is highly correlated with gender_female (High correlation)
office_or_hospital_based_Hospital is highly correlated with office_or_hospital_based_Office (High correlation)
office_or_hospital_based_Office is highly correlated with office_or_hospital_based_Hospital (High correlation)
is_gp is highly correlated with rep_id and 3 other fields (High correlation)
rep_id is highly correlated with is_gp and 1 other field (High correlation)
office_or_hospital_based_Office is highly correlated with is_gp and 2 other fields (High correlation)
office_or_hospital_based_Hospital is highly correlated with is_gp and 2 other fields (High correlation)
is_cardiologist is highly correlated with is_gp and 3 other fields (High correlation)
id is uniformly distributed (Uniform)
customer_id is uniformly distributed (Uniform)
email_open_total has 36890 (36.8%) zeros (Zeros)
f2f_total has 33974 (33.9%) zeros (Zeros)
prescription_total has 86554 (86.5%) zeros (Zeros)
webinar_total has 93792 (93.7%) zeros (Zeros)
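
The zero percentages in these warnings are the zero counts divided by the number of observations. A minimal sketch that recomputes them from the figures quoted above (the dictionary name is illustrative only):

```python
n_rows = 100116  # number of observations in the report

# Zero counts copied from the warnings list.
zero_counts = {
    "email_open_total": 36890,
    "f2f_total": 33974,
    "prescription_total": 86554,
    "webinar_total": 93792,
}

# Share of rows equal to zero, in percent.
zero_share = {name: 100 * count / n_rows for name, count in zero_counts.items()}

for name, pct in zero_share.items():
    print(f"{name}: {pct:.1f}% zeros")
```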

Reproduction

Analysis started: 2021-05-28 21:04:16.265159
Analysis finished: 2021-05-28 21:04:34.087162
Duration: 17.82 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
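
The reported duration follows from the two timestamps. A small sketch using only the standard library; the format string is an assumption that matches the timestamps as printed above.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the timestamps in the report

started = datetime.strptime("2021-05-28 21:04:16.265159", FMT)
finished = datetime.strptime("2021-05-28 21:04:34.087162", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```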

Variables

id
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM

Distinct: 8343
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4172.636821
Minimum: 0
Maximum: 8348
Zeros: 12
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 782.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 417
Q1: 2085
median: 4172
Q3: 6261
95-th percentile: 7929
Maximum: 8348
Range: 8348
Interquartile range (IQR): 4176

Descriptive statistics

Standard deviation: 2410.077866
Coefficient of variation (CV): 0.5775910938
Kurtosis: -1.200370264
Mean: 4172.636821
Median Absolute Deviation (MAD): 2088
Skewness: 0.0005144702163
Sum: 417747708
Variance: 5808475.318
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)
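
Several of the descriptive statistics reported for id are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks these against the reported figures; small differences come from the rounding of the printed values.

```python
import math

# Figures copied from the report for the `id` variable.
n_rows = 100116
mean = 4172.636821
std = 2410.077866
reported_cv = 0.5775910938
reported_variance = 5808475.318
reported_sum = 417747708

cv = std / mean          # coefficient of variation
variance = std ** 2      # variance is the squared standard deviation
total = mean * n_rows    # sum equals mean times number of observations

print(f"CV       = {cv:.10f} (reported {reported_cv})")
print(f"Variance = {variance:.3f} (reported {reported_variance})")
print(f"Sum      = {total:.0f} (reported {reported_sum})")
```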
Value                Count  Frequency (%)
0                    12     < 0.1%
5480                 12     < 0.1%
1322                 12     < 0.1%
7465                 12     < 0.1%
5416                 12     < 0.1%
3307                 12     < 0.1%
1258                 12     < 0.1%
7401                 12     < 0.1%
5352                 12     < 0.1%
3243                 12     < 0.1%
Other values (8333)  99996  99.9%

Value  Count  Frequency (%)
0      12     < 0.1%
1      12     < 0.1%
2      12     < 0.1%
3      12     < 0.1%
4      12     < 0.1%
5      12     < 0.1%
6      12     < 0.1%
7      12     < 0.1%
8      12     < 0.1%
9      12     < 0.1%

Value  Count  Frequency (%)
8348   12     < 0.1%
8347   12     < 0.1%
8346   12     < 0.1%
8345   12     < 0.1%
8344   12     < 0.1%
8343   12     < 0.1%
8342   12     < 0.1%
8341   12     < 0.1%
8340   12     < 0.1%
8339   12     < 0.1%

customer_id
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM

Distinct: 8343
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4272.636821
Minimum: 100
Maximum: 8448
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 782.3 KiB

Quantile statistics

Minimum: 100
5-th percentile: 517
Q1: 2185
median: 4272
Q3: 6361
95-th percentile: 8029
Maximum: 8448
Range: 8348
Interquartile range (IQR): 4176

Descriptive statistics

Standard deviation: 2410.077866
Coefficient of variation (CV): 0.5640727182
Kurtosis: -1.200370264
Mean: 4272.636821
Median Absolute Deviation (MAD): 2088
Skewness: 0.0005144702163
Sum: 427759308
Variance: 5808475.318
Monotonicity: Increasing
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
2049                 12     < 0.1%
3755                 12     < 0.1%
7849                 12     < 0.1%
5800                 12     < 0.1%
3691                 12     < 0.1%
1642                 12     < 0.1%
7785                 12     < 0.1%
5736                 12     < 0.1%
3627                 12     < 0.1%
1578                 12     < 0.1%
Other values (8333)  99996  99.9%

Value  Count  Frequency (%)
100    12     < 0.1%
101    12     < 0.1%
102    12     < 0.1%
103    12     < 0.1%
104    12     < 0.1%
105    12     < 0.1%
106    12     < 0.1%
107    12     < 0.1%
108    12     < 0.1%
109    12     < 0.1%

Value  Count  Frequency (%)
8448   12     < 0.1%
8447   12     < 0.1%
8446   12     < 0.1%
8445   12     < 0.1%
8444   12     < 0.1%
8443   12     < 0.1%
8442   12     < 0.1%
8441   12     < 0.1%
8440   12     < 0.1%
8439   12     < 0.1%

rep_id
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 153
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 154.2720844
Minimum: 100
Maximum: 252
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 782.3 KiB

Quantile statistics

Minimum: 100
5-th percentile: 102
Q1: 124
median: 149
Q3: 185
95-th percentile: 218
Maximum: 252
Range: 152
Interquartile range (IQR): 61

Descriptive statistics

Standard deviation: 36.60864628
Coefficient of variation (CV): 0.2372992264
Kurtosis: -0.8447485583
Mean: 154.2720844
Median Absolute Deviation (MAD): 29
Skewness: 0.3773270901
Sum: 15445104
Variance: 1340.192982
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
102                 4320   4.3%
145                 2412   2.4%
191                 2100   2.1%
147                 2100   2.1%
125                 2064   2.1%
129                 1716   1.7%
151                 1620   1.6%
124                 1584   1.6%
105                 1512   1.5%
153                 1512   1.5%
Other values (143)  79176  79.1%

Value  Count  Frequency (%)
100    1092   1.1%
101    948    0.9%
102    4320   4.3%
103    396    0.4%
104    372    0.4%
105    1512   1.5%
106    1476   1.5%
107    1428   1.4%
108    660    0.7%
109    348    0.3%

Value  Count  Frequency (%)
252    12     < 0.1%
251    12     < 0.1%
250    12     < 0.1%
249    12     < 0.1%
248    12     < 0.1%
247    36     < 0.1%
246    12     < 0.1%
245    12     < 0.1%
244    168    0.2%
243    12     < 0.1%

is_cardiologist
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 782.3 KiB

Value  Count
0      88080
1      12036

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 100116
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      88080  88.0%
1      12036  12.0%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      88080  88.0%
1      12036  12.0%

Most occurring characters

Value  Count  Frequency (%)
0      88080  88.0%
1      12036  12.0%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  100116  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      88080  88.0%
1      12036  12.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  100116  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      88080  88.0%
1      12036  12.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  100116  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      88080  88.0%
1      12036  12.0%

is_gp
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 782.3 KiB

Value  Count
1      88080
0      12036

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 100116
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      88080  88.0%
0      12036  12.0%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
1      88080  88.0%
0      12036  12.0%

Most occurring characters

Value  Count  Frequency (%)
1      88080  88.0%
0      12036  12.0%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  100116  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      88080  88.0%
0      12036  12.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  100116  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      88080  88.0%
0      12036  12.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  100116  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      88080  88.0%
0      12036  12.0%

years_since_graduation
Real number (ℝ≥0)

Distinct: 60
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.08845739
Minimum: 3
Maximum: 68
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 782.3 KiB

Quantile statistics

Minimum: 3
5-th percentile: 8
Q1: 17
median: 28
Q3: 37
95-th percentile: 46
Maximum: 68
Range: 65
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 12.36977458
Coefficient of variation (CV): 0.4566437433
Kurtosis: -0.9051889658
Mean: 27.08845739
Median Absolute Deviation (MAD): 10
Skewness: 0.007919249722
Sum: 2711988
Variance: 153.0113232
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
32                 2964   3.0%
31                 2952   2.9%
33                 2940   2.9%
35                 2928   2.9%
36                 2856   2.9%
30                 2832   2.8%
38                 2736   2.7%
26                 2652   2.6%
39                 2640   2.6%
11                 2616   2.6%
Other values (50)  72000  71.9%

Value  Count  Frequency (%)
3      12     < 0.1%
4      504    0.5%
5      1032   1.0%
6      1608   1.6%
7      1812   1.8%
8      2388   2.4%
9      2424   2.4%
10     2220   2.2%
11     2616   2.6%
12     2124   2.1%

Value  Count  Frequency (%)
68     12     < 0.1%
63     24     < 0.1%
60     24     < 0.1%
59     60     0.1%
58     60     0.1%
57     48     < 0.1%
56     120    0.1%
55     192    0.2%
54     240    0.2%
53     288    0.3%

time_window_id
Real number (ℝ≥0)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 782.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3.75
median: 6.5
Q3: 9.25
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 3.45206977
Coefficient of variation (CV): 0.5310876569
Kurtosis: -1.216784055
Mean: 6.5
Median Absolute Deviation (MAD): 3
Skewness: 0
Sum: 650754
Variance: 11.9167857
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value             Count  Frequency (%)
1                 8343   8.3%
2                 8343   8.3%
3                 8343   8.3%
4                 8343   8.3%
5                 8343   8.3%
6                 8343   8.3%
7                 8343   8.3%
8                 8343   8.3%
9                 8343   8.3%
10                8343   8.3%
Other values (2)  16686  16.7%

Value  Count  Frequency (%)
1      8343   8.3%
2      8343   8.3%
3      8343   8.3%
4      8343   8.3%
5      8343   8.3%
6      8343   8.3%
7      8343   8.3%
8      8343   8.3%
9      8343   8.3%
10     8343   8.3%

Value  Count  Frequency (%)
12     8343   8.3%
11     8343   8.3%
10     8343   8.3%
9      8343   8.3%
8      8343   8.3%
7      8343   8.3%
6      8343   8.3%
5      8343   8.3%
4      8343   8.3%
3      8343   8.3%
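
Because time_window_id takes each of the values 1 to 12 exactly 8343 times, the column can be rebuilt from its frequency table and the fractional quartiles recomputed. The sketch below uses the standard library's linearly interpolated ("inclusive") quantile method; that this is the same interpolation rule the report used is an assumption.

```python
import statistics

# Rebuild the column from the frequency table: values 1..12, 8343 rows each.
values = [window for window in range(1, 13) for _ in range(8343)]

# Quartiles with linear interpolation between neighbouring order statistics.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)

print(len(values), q1, median, q3, mean, variance)
```

With these inputs the quartiles come out as 3.75, 6.5 and 9.25, matching the quantile statistics listed for this variable.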

conference_total
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 782.3 KiB

Value  Count
0      90525
1      9152
2      428
3      11

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 100116
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      90525  90.4%
1      9152   9.1%
2      428    0.4%
3      11     < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      90525  90.4%
1      9152   9.1%
2      428    0.4%
3      11     < 0.1%

Most occurring characters

Value  Count  Frequency (%)
0      90525  90.4%
1      9152   9.1%
2      428    0.4%
3      11     < 0.1%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  100116  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      90525  90.4%
1      9152   9.1%
2      428    0.4%
3      11     < 0.1%

Most occurring scripts

Value   Count   Frequency (%)
Common  100116  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      90525  90.4%
1      9152   9.1%
2      428    0.4%
3      11     < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  100116  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      90525  90.4%
1      9152   9.1%
2      428    0.4%
3      11     < 0.1%

email_open_total
Real number (ℝ≥0)

ZEROS

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.995505214
Minimum: 0
Maximum: 7
Zeros: 36890
Zeros (%): 36.8%
Negative: 0
Negative (%): 0.0%
Memory size: 782.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 3
Maximum: 7
Range: 7
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 0.9942147861
Coefficient of variation (CV): 0.9987037458
Kurtosis: 0.913514897
Mean: 0.995505214
Median Absolute Deviation (MAD): 1
Skewness: 0.9843159179
Sum: 99666
Variance: 0.988463041
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value  Count  Frequency (%)
1      36929  36.9%
0      36890  36.8%
2      18391  18.4%
3      6046   6.0%
4      1533   1.5%
5      286    0.3%
6      32     < 0.1%
7      9      < 0.1%

Value  Count  Frequency (%)
0      36890  36.8%
1      36929  36.9%
2      18391  18.4%
3      6046   6.0%
4      1533   1.5%
5      286    0.3%
6      32     < 0.1%
7      9      < 0.1%

Value  Count  Frequency (%)
7      9      < 0.1%
6      32     < 0.1%
5      286    0.3%
4      1533   1.5%
3      6046   6.0%
2      18391  18.4%
1      36929  36.9%
0      36890  36.8%

f2f_total
Real number (ℝ≥0)

ZEROS

Distinct: 23
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.90197371
Minimum: 0
Maximum: 22
Zeros: 33974
Zeros (%): 33.9%
Negative: 0
Negative (%): 0.0%
Memory size: 782.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 3
95-th percentile: 6
Maximum: 22
Range: 22
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.155378666
Coefficient of variation (CV): 1.133232628
Kurtosis: 5.601702041
Mean: 1.90197371
Median Absolute Deviation (MAD): 1
Skewness: 1.808003049
Sum: 190418
Variance: 4.645657194
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
Value              Count  Frequency (%)
0                  33974  33.9%
1                  18825  18.8%
2                  16083  16.1%
3                  12435  12.4%
4                  7979   8.0%
5                  4653   4.6%
6                  2516   2.5%
7                  1449   1.4%
8                  850    0.8%
9                  480    0.5%
Other values (13)  872    0.9%

Value  Count  Frequency (%)
0      33974  33.9%
1      18825  18.8%
2      16083  16.1%
3      12435  12.4%
4      7979   8.0%
5      4653   4.6%
6      2516   2.5%
7      1449   1.4%
8      850    0.8%
9      480    0.5%

Value  Count  Frequency (%)
22     2      < 0.1%
21     4      < 0.1%
20     5      < 0.1%
19     6      < 0.1%
18     18     < 0.1%
17     26     < 0.1%
16     29     < 0.1%
15     43     < 0.1%
14     69     0.1%
13     95     0.1%

prescription_total
Real number (ℝ≥0)

ZEROS

Distinct: 32
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3535898358
Minimum: 0
Maximum: 42
Zeros: 86554
Zeros (%): 86.5%
Negative: 0
Negative (%): 0.0%
Memory size: 782.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 42
Range: 42
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.291228808
Coefficient of variation (CV): 3.651770152
Kurtosis: 112.2690004
Mean: 0.3535898358
Median Absolute Deviation (MAD): 0
Skewness: 7.950297873
Sum: 35400
Variance: 1.667271836
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=32)
Value              Count  Frequency (%)
0                  86554  86.5%
1                  4981   5.0%
2                  4108   4.1%
3                  1786   1.8%
4                  1031   1.0%
5                  565    0.6%
6                  339    0.3%
7                  199    0.2%
8                  151    0.2%
9                  99     0.1%
Other values (22)  303    0.3%

Value  Count  Frequency (%)
0      86554  86.5%
1      4981   5.0%
2      4108   4.1%
3      1786   1.8%
4      1031   1.0%
5      565    0.6%
6      339    0.3%
7      199    0.2%
8      151    0.2%
9      99     0.1%

Value  Count  Frequency (%)
42     1      < 0.1%
40     1      < 0.1%
35     1      < 0.1%
34     2      < 0.1%
30     3      < 0.1%
29     5      < 0.1%
26     2      < 0.1%
25     2      < 0.1%
23     10     < 0.1%
22     3      < 0.1%

webinar_total
Real number (ℝ≥0)

ZEROS

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.08218466579
Minimum: 0
Maximum: 9
Zeros: 93792
Zeros (%): 93.7%
Negative: 0
Negative (%): 0.0%
Memory size: 782.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.3676987394
Coefficient of variation (CV): 4.474055299
Kurtosis: 69.75802141
Mean: 0.08218466579
Median Absolute Deviation (MAD): 0
Skewness: 6.790673379
Sum: 8228
Variance: 0.135202363
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
Value  Count  Frequency (%)
0      93792  93.7%
1      5061   5.1%
2      876    0.9%
3      251    0.3%
4      67     0.1%
5      41     < 0.1%
7      14     < 0.1%
6      11     < 0.1%
8      2      < 0.1%
9      1      < 0.1%

Value  Count  Frequency (%)
0      93792  93.7%
1      5061   5.1%
2      876    0.9%
3      251    0.3%
4      67     0.1%
5      41     < 0.1%
6      11     < 0.1%
7      14     < 0.1%
8      2      < 0.1%
9      1      < 0.1%

Value  Count  Frequency (%)
9      1      < 0.1%
8      2      < 0.1%
7      14     < 0.1%
6      11     < 0.1%
5      41     < 0.1%
4      67     0.1%
3      251    0.3%
2      876    0.9%
1      5061   5.1%
0      93792  93.7%

gender_female
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 782.3 KiB

Value  Count
1      65616
0      34500

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 100116
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      65616  65.5%
0      34500  34.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
1      65616  65.5%
0      34500  34.5%

Most occurring characters

Value  Count  Frequency (%)
1      65616  65.5%
0      34500  34.5%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  100116  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      65616  65.5%
0      34500  34.5%

Most occurring scripts

Value   Count   Frequency (%)
Common  100116  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      65616  65.5%
0      34500  34.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  100116  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      65616  65.5%
0      34500  34.5%

gender_male
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 782.3 KiB

Value  Count
1      65616
0      34500

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 100116
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      65616  65.5%
0      34500  34.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
1      65616  65.5%
0      34500  34.5%

Most occurring characters

Value  Count  Frequency (%)
1      65616  65.5%
0      34500  34.5%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  100116  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      65616  65.5%
0      34500  34.5%

Most occurring scripts

Value   Count   Frequency (%)
Common  100116  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      65616  65.5%
0      34500  34.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  100116  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      65616  65.5%
0      34500  34.5%

office_or_hospital_based_Hospital
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 782.3 KiB

Value  Count
0      88980
1      11136

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 100116
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      88980  88.9%
1      11136  11.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      88980  88.9%
1      11136  11.1%

Most occurring characters

Value  Count  Frequency (%)
0      88980  88.9%
1      11136  11.1%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  100116  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      88980  88.9%
1      11136  11.1%

Most occurring scripts

Value   Count   Frequency (%)
Common  100116  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      88980  88.9%
1      11136  11.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  100116  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      88980  88.9%
1      11136  11.1%

office_or_hospital_based_Office
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 782.3 KiB

Value  Count
1      88980
0      11136

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 100116
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      88980  88.9%
0      11136  11.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
1      88980  88.9%
0      11136  11.1%

Most occurring characters

Value  Count  Frequency (%)
1      88980  88.9%
0      11136  11.1%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  100116  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      88980  88.9%
0      11136  11.1%

Most occurring scripts

Value   Count   Frequency (%)
Common  100116  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      88980  88.9%
0      11136  11.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  100116  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      88980  88.9%
0      11136  11.1%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only its direction determines the sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
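
The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report generator uses; the function name `pearson_r` and the sample lists are made up for the example.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centred products; the shared 1/n factor cancels in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sample data: y is an exact linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

r_xy = pearson_r(x, y)                              # perfect positive linear relation
r_rescaled = pearson_r([10 * a + 3 for a in x], y)  # shifting and scaling x leaves r unchanged
r_flipped = pearson_r(x, [-b for b in y])           # reversing the direction flips the sign
```

The rescaled call illustrates the invariance under changes in location and scale mentioned in the description.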

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
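
As a rough sketch of that recipe, the plain-Python example below converts each variable to ranks and then applies the Pearson formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the sample data are illustrative assumptions. Giving tied values the mean of their rank positions is a common convention, assumed here rather than taken from the report.

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j, converted to 1-based ranks
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y = x**3 is monotonic but not linear in x.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
tied = average_ranks([10, 20, 20, 30])
```

On this data ρ reaches its maximum because the relation is perfectly monotonic, while r stays below 1 because the relation is not linear.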

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
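
The pair-counting definition above can be written out directly, as in the sketch below. It assumes that tied pairs count as neither concordant nor discordant; statistical libraries often use a tie-adjusted variant, so results can differ when ties are present. The function name `kendall_tau` and the sample data are illustrative only.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs, as described in the text."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product tells whether both variables move the same way.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y; assumed to count as neither.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical data: one swapped pair among six possible pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau(x, y)  # 5 concordant, 1 discordant, 6 pairs in total
```

This double loop is quadratic in the number of observations, which is fine for illustration but slow for a dataset of the size profiled here.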

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma (2013).
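
For illustration, one common way to write the bias-corrected estimator is sketched below in plain Python, starting from a contingency table of counts. It shrinks φ² = χ²/n by (k-1)(r-1)/(n-1) and adjusts the row and column counts before normalising. The function name, the list-of-lists input and the example tables are assumptions for this sketch, and the report generator's own implementation may differ in detail.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k contingency table of counts."""
    r = len(table)
    k = len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]
    if r < 2 or k < 2 or n < 2:
        raise ValueError("need at least a 2 x 2 table with 2 observations")
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("empty rows or columns are not handled in this sketch")

    # Pearson chi-squared statistic from observed and expected counts.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        raise ValueError("sample too small for the corrected estimator")
    return math.sqrt(phi2_corr / denom)

# Hypothetical 2 x 2 tables of counts.
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # categories determine each other
v_independent = cramers_v_corrected([[25, 25], [25, 25]])  # no association
```

The `max(0.0, ...)` clamp keeps the corrected value non-negative when the observed association is smaller than the expected bias.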

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

id | customer_id | rep_id | is_cardiologist | is_gp | years_since_graduation | time_window_id | conference_total | email_open_total | f2f_total | prescription_total | webinar_total | gender_female | gender_male | office_or_hospital_based_Hospital | office_or_hospital_based_Office
0010010001231004001101
1010010001232002001101
2010010001233014001101
3010010001234108001101
4010010001235038001101
5010010001236006321101
6010010001237014021101
7010010001238007031101
8010010001239109011101
901001000123100310011101

Last rows

id | customer_id | rep_id | is_cardiologist | is_gp | years_since_graduation | time_window_id | conference_total | email_open_total | f2f_total | prescription_total | webinar_total | gender_female | gender_male | office_or_hospital_based_Hospital | office_or_hospital_based_Office
100106834884481150133000001101
100107834884481150134000001101
100108834884481150135000001101
100109834884481150136030001101
100110834884481150137000001101
100111834884481150138010001101
100112834884481150139001101101
1001138348844811501310003001101
1001148348844811501311014001101
1001158348844811501312000001101